Update Response Parsing Protocol Tests #3247

jonathan343 · 2024-09-06T15:41:22Z

Note: Request serialization tests will be added in a follow-up PR in an effort to reduce the scope of this one.

Request Serialization PR: Update Request Serialization Protocol Tests #3378

Summary

This PR replaces the existing response parsing protocol tests in botocore/tests/unit/protocols/output with new tests generated from Smithy protocol test models. We use a version of these Smithy tests that are converted to the format currently supported by our existing test runner (test_protocols.py). This PR makes updates to the botocore Parsers and protocol test runner to comply with a majority of the new test cases.

Granular Protocol Ignore List

This PR introduces a protocol-tests-ignore-list.json file that can be used to ignore specific protocol test suites or cases.
The structure for this list is defined below:

{
  "general": {
    "<test_type>": {
      "suites": [
        "Example test suite description"
      ],
      "cases": [
        "example-test-case-id"
      ]
    }
  },
  "protocols": {
    "<protocol_name>": {
      # Same as "general"  
    }
  }
}

<test_type> - There are two types of protocol tests.
- input - Request serialization protocol test.
- output - Response parsing protocol test.
- Note: Only output will work until the request serialization tests are updated.
suites - A list of test suite descriptions to ignore. When specified, all test cases for the related suite will be ignored.
cases - A list of test case ids to ignore.
<protocol_name> - The protocol name as represented by its corresponding protocol test file name (without the .json extension).
- Supported values are ec2, json, json_1_0, query, rest-json, rest-xml. This may grow as we add more protocols.
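As an illustration of how a test runner might consult this structure, here is a minimal sketch. The helper name `should_ignore` and the sample entries are hypothetical and are not taken from test_protocols.py; only the "general", "protocols", "suites", and "cases" keys come from the structure described above.

```python
# Hypothetical sketch: decide whether a protocol test case is ignored.
# Only the key layout ("general", "protocols", "suites", "cases") comes
# from the structure described above; names and sample data are made up.

def should_ignore(ignore_list, protocol, test_type, suite_description, case_id):
    """Return True if the suite or the case is listed for this test type."""
    scopes = (
        ignore_list.get("general", {}),
        ignore_list.get("protocols", {}).get(protocol, {}),
    )
    for scope in scopes:
        entry = scope.get(test_type, {})
        if suite_description in entry.get("suites", []):
            return True
        if case_id in entry.get("cases", []):
            return True
    return False


sample_ignore_list = {
    "general": {
        "output": {"suites": ["Example test suite description"], "cases": []}
    },
    "protocols": {
        "query": {"output": {"cases": ["example-test-case-id"]}}
    },
}
```

In this sketch a "general" entry applies to every protocol, while a "protocols" entry applies only to the named protocol and test type.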

codecov-commenter · 2024-09-06T15:47:58Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (develop@19140a9). Learn more about missing BASE report.
Report is 199 commits behind head on develop.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #3247   +/-   ##
==========================================
  Coverage           ?   93.05%           
==========================================
  Files              ?       66           
  Lines              ?    14507           
  Branches           ?        0           
==========================================
  Hits               ?    13500           
  Misses             ?     1007           
  Partials           ?        0

nateprewitt

Handful of questions about the current implementation.

nateprewitt · 2024-09-06T19:53:21Z

botocore/parsers.py

+            cleaned_value = {
+                k: v for k, v in cleaned_value.items() if v is not None
+            }


I'm not sure I understand how we're getting to a state where we have None values arriving here. Do we have an example of what that case might look like?

The Rust SDK has experienced issues (aws-sdk-rust#1095) when parsing responses from services like QuickSight that populate additional union members with null values. Although the QuickSight case doesn't specifically affect boto3 due to differences in the modeled shape between Smithy and C2J, it's possible for any other service to send us this type of response for a union structure.

Related Protocol Test Id: AwsJson10DeserializeAllowNulls
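To make the null-member case concrete, here is a minimal sketch of the filtering step. The helper name `drop_null_members` and the sample payload are hypothetical and do not come from a real service response; the dict comprehension mirrors the one in the diff above.

```python
# Hypothetical sketch of dropping null-valued members before checking a union.
# The parsed value below is made-up sample data, not a real service payload.

def drop_null_members(value):
    """Return a copy of a parsed structure without members set to None."""
    return {k: v for k, v in value.items() if v is not None}


parsed_union = {"stringValue": "hello", "numberValue": None, "blobValue": None}
cleaned = drop_null_members(parsed_union)

# A union is expected to have exactly one member set; after cleaning,
# that holds for this sample even though the raw value had three keys.
is_valid_union = len(cleaned) == 1
```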

nateprewitt · 2024-09-06T20:06:53Z

botocore/parsers.py

@@ -577,7 +590,7 @@ def _do_parse(self, response, shape):
        return self._parse_body_as_xml(response, shape, inject_metadata=True)

    def _parse_body_as_xml(self, response, shape, inject_metadata=True):
-        xml_contents = response['body']
+        xml_contents = response['body'] or b'<xml/>'


Is there something unique about this case or should this be handled farther down in _parse_xml_string_to_dom? I'm curious what valid cases we're expecting the service to send us a body and we're getting nothing.

Is there something unique about this case or should this be handled farther down in _parse_xml_string_to_dom?

The RestXMLParser handles this similarly by checking for an empty string before passing it to _parse_xml_string_to_dom:

def _initial_body_parse(self, xml_string):
    if not xml_string:
        return ETree.Element('')
    return self._parse_xml_string_to_dom(xml_string)

It would make sense to move this logic into _parse_xml_string_to_dom to handle both the restxml and query cases.

However, you bring up a good question:

I'm curious what valid cases we're expecting the service to send us a body and we're getting nothing.

The case the protocol tests cover is a service that doesn't define output members and therefore expects nothing in the body. REST-XML services send us an empty body; query services, however, send us something similar to the following rather than an empty body:

<?xml version="1.0"?>
<DeleteDomainResponse xmlns="http://sdb.amazonaws.com/doc/2009-04-15/">
  <ResponseMetadata>
    <RequestId>c192b7ed-2f6f-ad6b-1a38-da6dce8d7bdb</RequestId>
    <BoxUsage>0.0055590278</BoxUsage>
  </ResponseMetadata>
</DeleteDomainResponse>

Conclusion
For now, I think I can move this logic into the test runner instead of our parser until we have known cases where this happens or get more guidance from Smithy on whether this case should actually be handled.
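For reference, the alternative of handling an empty body in one shared place could look roughly like the sketch below. This is not botocore's implementation: `parse_xml_or_empty` is a hypothetical standalone function, and the sample response is a shortened, made-up variant of the query response shown above.

```python
import xml.etree.ElementTree as ETree

# Hypothetical sketch, not botocore's implementation: treat an empty body
# as an empty element so callers never hand an empty string to the XML parser.

def parse_xml_or_empty(xml_string):
    if not xml_string:
        return ETree.Element('')
    return ETree.fromstring(xml_string)


# REST-XML style: nothing in the body at all.
empty_root = parse_xml_or_empty(b'')

# Query style: a wrapper element that only carries response metadata.
query_root = parse_xml_or_empty(
    b'<DeleteDomainResponse><ResponseMetadata>'
    b'<RequestId>example-request-id</RequestId>'
    b'</ResponseMetadata></DeleteDomainResponse>'
)
```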

nateprewitt · 2024-09-06T20:11:11Z

botocore/parsers.py

            if '#' in code:
-                code = code.rsplit('#', 1)[1]
+                code = code.split('#', 1)[1]


This is semantically different. Are we enforcing that the code is everything following the first #?

If a : character is present, then take only the contents before the first : character in the value.
If a # character is present, then take only the contents after the first # character in the value.

I changed this to follow the Smithy guidance as closely as possible. I'm not aware of any cases in the protocol tests or services where a value contains multiple # characters. However, to prevent any unwanted changes in behavior, I'll revert this to our existing behavior.
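The semantic difference only shows up when a value contains more than one # character. A small sketch, using a made-up value that is not from any protocol test or service:

```python
# Made-up value with two '#' characters, for illustration only.
code = "com.example#Outer#Inner"

after_first = code.split('#', 1)[1]   # everything after the first '#'
after_last = code.rsplit('#', 1)[1]   # everything after the last '#'

# With a single '#', both calls produce the same result.
single = "aws.protocoltests.restjson#FooError"
same_result = single.split('#', 1)[1] == single.rsplit('#', 1)[1]
```

Here `after_first` keeps the trailing `#Inner` segment while `after_last` drops everything up to the final separator.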

nateprewitt · 2024-09-06T20:16:35Z

botocore/parsers.py

+            if ':' in code:
+                code = code.split(':', 1)[0]


There are now 3 distinct if statements here that aren't using elif/else. Is the intent to continually mutate the code value as it goes through each check?

e.g.

com.test.mypackage#this:that#1 -- split(':', 1)[0] --> com.test.mypackage#this
com.test.mypackage#this -- split('#', 1)[1] --> this

Then this is evaluated for the x-amzn-query-error code?

Yup that's correct. Smithy provides the following examples:

All of the following error values resolve to FooError:

FooError
FooError:http://internal.amazon.com/coral/com.amazon.coral.validate/
aws.protocoltests.restjson#FooError
aws.protocoltests.restjson#FooError:http://internal.amazon.com/coral/com.amazon.coral.validate/
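The two checks can be exercised against these examples with a short sketch. The function name `normalize_error_code` is hypothetical; the body mirrors the diff under review, which uses split for the # case (the thread above notes that the existing rsplit behavior will be kept, and the two agree for these inputs).

```python
def normalize_error_code(code):
    """Hypothetical helper: strip a trailing ':...' suffix and a leading namespace."""
    if ':' in code:
        # Keep only the contents before the first ':'.
        code = code.split(':', 1)[0]
    if '#' in code:
        # Keep only the contents after the first '#'.
        code = code.split('#', 1)[1]
    return code


examples = [
    "FooError",
    "FooError:http://internal.amazon.com/coral/com.amazon.coral.validate/",
    "aws.protocoltests.restjson#FooError",
    "aws.protocoltests.restjson#FooError:http://internal.amazon.com/coral/com.amazon.coral.validate/",
]
resolved = [normalize_error_code(value) for value in examples]
```

The order matters: the ':' check runs first so that the URL suffix is removed before the namespace is stripped.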

nateprewitt · 2024-09-06T20:18:37Z

botocore/parsers.py

+        if code is not None:
+            if ':' in code:
+                code = code.split(':', 1)[0]
+            if '#' in code:
+                code = code.split('#', 1)[1]
+            error['Error']['Code'] = code


It looks like we're changing behavior again here. Is there additional scope, or why are we post-processing the body for the code case now?

nateprewitt · 2024-09-06T20:23:12Z

tests/unit/test_protocols.py

+PROTOCOL_TEST_IGNORE_LIST_PATH = os.path.join(TEST_DIR, IGNORE_LIST_FILENAME)
+with open(PROTOCOL_TEST_IGNORE_LIST_PATH) as f:
+    PROTOCOL_TEST_IGNORE_LIST = json.load(f)


Do we use PROTOCOL_TEST_IGNORE_LIST_PATH somewhere else? Should this whole thing be a self-contained function or pytest fixture instead?

I'll add this to a get_protocol_test_ignore_list function as shown below:

def get_protocol_test_ignore_list():
    ignore_list_path = os.path.join(TEST_DIR, IGNORE_LIST_FILENAME)
    with open(ignore_list_path) as f:
        return json.load(f)

nateprewitt · 2024-09-06T20:24:54Z

tests/unit/test_protocols.py

+        # If a test case doesn't define a response body, set it to `None`.
+        if 'body' in case['response']:
+            body_bytes = case['response']['body'].encode('utf-8')
+            case['response']['body'] = body_bytes
+        else:
+            case['response']['body'] = b''


Where are we setting the body to None? It looks like our fallback is empty bytes. Is this what's causing issues with the xml test above?

nateprewitt · 2024-09-06T20:29:01Z

tests/unit/test_protocols.py

@@ -462,7 +503,7 @@ def _get_suite_test_id():
        if len(split) == 2:
            suite_id, test_id = int(split[0]), int(split[1])
        else:
-            suite_id = int(split([0]))
+            suite_id = int(split[0])


Is this code even reachable? How did this work before?

At the top of this file, we provide guidance for running a specific test suite:

To run a single test suite you can set the BOTOCORE_TEST_ID env var:

BOTOCORE_TEST=tests/unit/protocols/input/json.json BOTOCORE_TEST_ID=5 \
    pytest tests/unit/test_protocols.py

However, if you follow this example you'll receive the following error:

TypeError: Invalid format for BOTOCORE_TEST_ID, should be suite_id[:test_id], and both values should be integers.

The else condition here is intended to handle this case, but it instead fails because of the incorrect syntax: split([0]) tries to call the list rather than index it.
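A standalone sketch of parsing the suite_id[:test_id] format is shown below. The function name `parse_test_id` and its error handling are assumptions for illustration and are not the code in test_protocols.py.

```python
def parse_test_id(raw):
    """Hypothetical parser for 'suite_id[:test_id]'; test_id is None if omitted."""
    message = (
        "Invalid format, should be suite_id[:test_id], "
        "and both values should be integers."
    )
    parts = raw.split(':')
    if len(parts) not in (1, 2):
        raise ValueError(message)
    try:
        # Index the list with parts[0]; parts([0]) would try to call the list.
        suite_id = int(parts[0])
        test_id = int(parts[1]) if len(parts) == 2 else None
    except ValueError:
        raise ValueError(message) from None
    return suite_id, test_id
```

With this sketch, a value such as "5" selects a whole suite and "5:2" selects a single case within it.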

jonathan343 · 2025-01-10T15:06:09Z

botocore/parsers.py

+            cleaned_value = {
+                k: v for k, v in cleaned_value.items() if v is not None
+            }


The Rust SDK has experienced issues (aws-sdk-rust#1095) when parsing responses from services like QuickSight that populate additional union members with null values. Although the QuickSight case doesn't specifically affect boto3 due to differences in the modeled shape between Smithy and C2J, it's possible for any other service to send us this type of response for a union structure.

Related Protocol Test Id: AwsJson10DeserializeAllowNulls

jonathan343 · 2025-01-10T17:58:28Z

botocore/parsers.py

+            serialized_member_names = [
+                shape.members[member].serialization.get('name', member)
+                for member in shape.members
+            ]
+            if tag not in serialized_member_names:


When parsing union structure members, botocore uses the raw member name from the HTTP response to determine whether the member is known. If it is unknown, the union is populated with the following member:

"SDK_UNKNOWN_MEMBER": { "name": "<unknown member name>" }

Botocore fails to consider union members modeled with a locationName, which provides an alias for a member's name.

Solution: When determining whether a union member is known, we should consider the serialized names of members modeled with the locationName trait.

Related Protocol Test Ids: PostUnionWithJsonNameResponse1, PostUnionWithJsonNameResponse2
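The lookup can be sketched without botocore's model classes. In the example below, plain dicts stand in for shape.members and each member's serialization mapping; the member names, the alias, and both helper functions are hypothetical.

```python
# Hypothetical stand-in for a modeled union shape:
# member name -> serialization info (the "name" key plays the role of locationName).
union_members = {
    "Foo": {"name": "FOO"},  # member aliased via locationName
    "Bar": {},               # no alias; serialized under its own name
}


def is_known_member(tag, members):
    """Check the tag against serialized names, falling back to the member name."""
    serialized_names = [
        serialization.get("name", member)
        for member, serialization in members.items()
    ]
    return tag in serialized_names


def unknown_member_or_none(tag, members):
    """Return the SDK_UNKNOWN_MEMBER placeholder for tags the model doesn't know."""
    if not is_known_member(tag, members):
        return {"SDK_UNKNOWN_MEMBER": {"name": tag}}
    return None
```

Comparing against the raw member names alone would treat "FOO" as unknown, which is the behavior this change is meant to correct.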

These will fail until #boto#3247 is merged and this is properly rebased. This is intentional: because this work is happening in parallel, it is an easy way to make sure we don't forget to merge the protocol tests.

jonathan343 requested review from nateprewitt, SamRemis and alexgromero September 6, 2024 18:10

nateprewitt reviewed Sep 6, 2024

jonathan343 added 8 commits January 8, 2025 15:34

Add smithy generated response parsing protocol tests

9b7cb32

Fix minor typos in test_protocols.py

568fe01

Implement granular protocol tests ignore list

244f146

Implement granular protocol tests ignore list

f9acd2b

* Support old ignore list for serializer tests

72d05f1

* Update ignore list * Resolve some parser issues

Run formatter

8a12dbe

Fix docstring spacing.

69ec28a

Update to the latest protocol test models

5d9bb32

jonathan343 force-pushed the update-output-protocol-tests branch from 57d7547 to 5d9bb32 Compare January 9, 2025 21:07

jonathan343 commented Jan 10, 2025

jonathan343 added 2 commits February 3, 2025 10:38

Clean up and PR feedback.

843f61c

Merge branch 'develop' into update-output-protocol-tests

50465e9

jonathan343 requested a review from nateprewitt February 3, 2025 15:58

SamRemis reviewed Feb 5, 2025

jonathan343 mentioned this pull request Feb 9, 2025

Update Protocol Tests #3227

Closed

jonathan343 added 2 commits February 10, 2025 16:44

Bring back legacy protocol tests to ensure no coverage loss

d80d012

remove duplicate input tests

8f4656e

SamRemis approved these changes Feb 11, 2025

SamRemis mentioned this pull request Feb 13, 2025

Rpcv2 cbor protocol support #3391

Draft

jonathan343 added 2 commits February 13, 2025 14:47

revert changes and ignore the IgnoreQueryParamsInResponse case

433eab9

Merge branch 'develop' into update-output-protocol-tests

8c5be35

SamRemis approved these changes Feb 13, 2025
